Discrete Marks

Dependencies

Code
source("utils.R")

Setup

Code
spe <- readRDS("../data/spe.rds")

#subset the data to only look at sample ID 0.01, 0.06 and 0.26
# list("-0.29", "0.01", "0.06")
#zstack_list <- list("-0.04", '-0.09', '-0.14', '-0.19', '-0.24', '-0.29', '0.01', '0.06', '0.11', '0.16', '0.21', "0.26")

#define the Z-stacks that you want to compare
zstack_list <- list("-0.09", "0.01", "0.21")

#define the celltype that you want to compare across the stacks - hereby we assume independence across the z-stacks which is an assumption that can be challenged
celltype_ls <- "OD Mature"

selectZstacks <- function(zstack, spe){
  sub <- spe[, spe$sample_id == zstack]
  pp <- .ppp(sub, marks = "cluster_id")
  return(pp)
}
pp_ls <- lapply(zstack_list, selectZstacks, spe)
names(pp_ls) <- zstack_list

The theory of spatial point patterns is discussed in great detail in (Baddeley, Rubak, and Turner 2015). The book has an accompanying package called spatstat which offers great functionality to the theoretical concepts discribed in the book (Baddeley and Turner 2005). This chapter relies heavily on both publications.

Concepts and Definitions of Point Processes

Point Process

In point pattern analysis we assume that the patterns we observe are a realisation of a stochastic process called a point process. The inferences we make about the point pattern are based on the point process. E.g. the pattern can be said to be created by a poisson point process and thus is evenly distributed in the observation window (Baddeley, Rubak, and Turner 2015, 127).

When considering a pattern with \(m\) multiple types, as we do in the (Moffitt et al. 2018) dataset, there are two very closely related concepts. One can view the pattern as a multitype point pattern, where all the points are sampled from the same point process. The other option is to consider the pattern as a multivariate point pattern, where the points come from \(m\) distinct point processes. The difference between these two views is that in the multitype framework we assume the points to stem from the same point process and thus depend on each other. In the multivariate framework we assume that the types stem from independent point processes and therefore we can consider dependencies of one type alone. Whether or not the underlying point processes are independent depends on the biological question. If we analyse two celltypes in one slice of a tissue, we should consider them as being sampled from one point process. However, if we consider the distribution of a celltype in two slices of the same tissue we can have grounds to consider the point processes as independent (Baddeley, Rubak, and Turner 2015, 565).

In the following we will often compare the distribution of mature oligodendrocytes (OD mature cells) across different z-slices of the same tissue. We assume these slices to be enough far away to be considered independent. Therefore, we are in a multivariate setting. Sometimes, we show examples of methods for one celltype in a slice. This would correspond to a univariate setting.

Observation Windows

The most common set up in point pattern analysis is what we call window sampling. Instead of observing the entire pattern we observe a subset of this pattern in the so called window. In the analysis we try to make inference on the entire point process based on the observed window. An example could be different small microscopy windows through which a big tissue slice is observed. The windows would be samples of the bigger point process. In this case, it would be wrong to assume the window to be the convex hull around the observed points because they are just a sample of the bigger point pattern (Baddeley, Rubak, and Turner 2015, 144–45).

There is another concept called the small world model. It assumes that points can only be observed in a finite small world and not beyond these boundaries. When thinking of an entire tissue, this is a very common scenario. Cells can only be observed within the tissue and not beyond. In this case, it would be correct to not assume a rectangular observation window but to use more conservative methods to estimate an unknown sampling window such as the Ripley-Rasson estimate of a spatial domain (Baddeley, Rubak, and Turner 2015, 144–45).

In both cases it is important to understand the direction of the bias. If the unknown window is estimated to be smaller than the true window, we underestimate the window. This then again leads to an overestimation of the density of points and to other characteristics of the pattern. Therefore, an underestimation of the window size is more concerning than a slight overestimation (Baddeley, Rubak, and Turner 2015, 144–45).

In our example dataset we analyse the mouse preoptic hypothalamus (Moffitt et al. 2018). This is a tissue of the mouse brain that is cut out of a bigger context. The lower boundary is the end of the tissue whereas the upper three boundaries are a technical boundary. Therefore, our example is a mixture between window sampling and the small world model. In order to decrease the bias of the tissue boarder, we use the Ripley-Rasson estimate of a spatial domain to estimate the sampling window.

Code
setRiprasWindows <- function(pp){
  Window(pp) <- ripras(pp)
  marks(pp) <- factor(marks(pp))
  return(pp)
}
#the entire point patterns with the ripras windows
pp <- lapply(pp_ls, setRiprasWindows)

separateMarks <- function(pp){
  #split the multitype point process into several single type processes
  ppls <- split(pp)
  return (ppls)
}
#the point patterns separated by their marks
pp_ls <- lapply(pp, separateMarks)
Code
par(mfrow=c(1,3))
#Plot the marks separately 
lapply(zstack_list, function(zstack){
  plot(pp_ls[[zstack]][[celltype_ls]], main = zstack, legend = FALSE)
})

[[1]]
Symbol map with constant values
cols: #000000E4

[[2]]
Symbol map with constant values
cols: #000000C2

[[3]]
Symbol map with constant values
cols: #0000007B
Code
dev.off()
null device 
          1 

Complete Spatial Randomness

Complete spatial randomness (CSR) is often used as the null model for various point patterns, and is the result of a Poisson process. A completely spatial random process is characterised by two properties, homogeneity and independence, as discussed below (Baddeley, Rubak, and Turner 2015, 132).

Homogeneity

“Homogeneity […] means that the expected number of points falling in a region B should be proportional to its area |B|” (Baddeley, Rubak, and Turner 2015, 132) given a proportionality constant \(\lambda\). The constant \(\lambda\) represents the intensity of the process, i.e., the average number of points in a unit area (Baddeley, Rubak, and Turner 2015, 132–33). :

\[ \mathbb{E}[X\cap B] = \lambda |B|. \label{eq:expected_number_points} \]

Independence

Independence implies that in two (non-overlapping) regions \(A\) and \(B\), the number of points \(n(X\cap A)\) and \(n(X\cap B)\) are independent random variables. In other words, the number of points in region \(A\) does not affect the number of points in region \(B\). In addition, the number of points, \(N = n(X\cap B)\), follows a Poisson distribution:

\[ \mathbb{P}[N=k] = e^{-\mu}\frac{\mu^k}{k!}\\ \label{eq:poisson_process} \] where \(k = \lambda |B|\) (Baddeley, Rubak, and Turner 2015, 133).

Inhomogeneous Poisson Process

A Poisson process that is spatially varying in its average density of points is called inhomogeneous. Here, the average density, \(\lambda(u)\), sometimes known as the intensity function (see below), is a function of spatial location \(u\). In this case, the expected number of points falling into a region \(B\), \(\mu = n(X\cap B)\), is an integration of the intensity function over that region (Baddeley, Rubak, and Turner 2015, 138).

\[ \mu = \int_{B} \lambda(u) du. \label{eq:expected_number_inhomogeneous} \]

Isotropy

A point process is called isotropic, if its statistical properties are invariant to rotations; a CSR process is both stationary and isotropic (Baddeley, Rubak, and Turner 2015, 147).

Stationarity

“A point process is called stationary if, when we view the process through a window W , its statistical properties do not depend on the location of the window in two-dimensional space” (Baddeley, Rubak, and Turner 2015, 146). This is the case for any homogeneous point process, where the statistical properties of the pattern are unchanged given shifting of the observation window. This means it is stationary in all statistical properties; first-order properties (e.g. intensity) and second-order properties (e.g. correlation) (Baddeley, Rubak, and Turner 2015, 218).

Correlation stationarity / Second-order stationarity

It is important to note that the metrics designed for inhomogeneous point processes should not be applied to every spatially inhomogeneous point process. They should only be applied to point processes that are “correlation stationary”, which specifies the correlation metrics only depends on the relative position in subpatterns of the point process, i.e., that estimates of the inhomogeneous metric should be similar in different subquadrats of the point pattern (Baddeley, Rubak, and Turner 2015, 689).

Local scaling

If a process is not correlation stationary, so the estimates of the inhomogeneous metric vary between locations, locally-scaled versions of the metric can be applicable. This means in small subregions, the process is still stationary and isotropic, but there is a rescaling factor that can vary across the total process (Baddeley, Rubak, and Turner 2015, 246–47).

We can use a permutation test to test the inhomogeneity assumption. In this scenario, we split the patterns into quadrats and compare the estimated functions between the quadrats. It should be noted that this test depends on the arbitrary definition of the quadrats. Given our chosen patterns are not independent but result as marks from an overall point-pattern, the permutation approach is questionable. Furthermore, the outcome of the permutation test depends heavily on the choice of the quadrats. Therefore, the interpretation can be difficult (Baddeley, Rubak, and Turner 2015, 689–93).

Code
permutation_test <- function(pp, mark, split, minpoints) {
  pp_sel <-  subset(pp, marks %in% mark, drop = TRUE)
  
  rho_est <- rhohat(unmark(pp_sel), "x", method="tr")
  lambda <- predict(rho_est)

  tesselation <- quantess(unmark(pp_sel), "x", 3)
  tesselation_split <- nestsplit(pp_sel, tesselation, ny=split)
  
  plot(tesselation_split, main = mark)
  
  tesselation_split$inten <- factor(as.integer(tesselation_split$f1) <= 1, labels=c("Hi","Lo"))
  
  res.scaled <- studpermu.test(tesselation_split, pts ~ inten, summaryfunction=Kscaled,
                 minpoints = minpoints)
  
  res.inhom <- studpermu.test(tesselation_split, pts ~ inten, summaryfunction=Kinhom,
                 lambda=lambda, minpoints = minpoints)
  
  #p-value of the local-scaling test
  print(paste0(mark,' local scaling test ', res.scaled$p.value))
  
  #p-value of the inhomogeneity test
  print(paste0(mark,' inhomogeneity test ', res.inhom$p.value))
}
lapply(c("Microglia", "OD Mature", "Ependymal"), function(x) permutation_test(pp[['0.01']], x, split = 3, minpoints = 10))

The p-value of the test for local scaling for microglia cells is \(<0.05\) which indicates that the assumption of local scaling is rejected. Therefore, the distribution of microglia cells is not a scaled version of an overall distribution pattern. The p-value of the test for inhomogeneity for both microglia cells is \(>0.05\) indicating that the assumption of correlation stationarity is not rejected. In this case we can use the inhomogeneous version of the K-function which assumes correlation stationarity.

For ependymal and OD mature cells however, the p-values for both the local scaling test and the inhomogeneity test are \(>0.05\) which means that both the correlation stationarity assumption and the local scaling assumption can’t be rejected. [ME: Does this make sense? Or is this just an artifact?]

As the interpretation of the permutation test is highly dependent on the quadrats, the results should be interpreted with care. Both inhomogeneous and locally scaled versions of the summary functions have support and both offer interesting insights into the spatial pattern. Therefore, we will compare all versions and show what the choice of metrics means for their interpretation.

Intensity

Intensity is the expected density of points per unit area. It can be interpreted as the rate of occurrence or the abundance of events recorded. The intensity represents a first order property because it is related to the expected number of points . More formally the average intensity of a point process is defined as:

\[ \bar{\lambda} = \frac{n(x)}{|W|} \label{eq:average_intensity} \]

As this is an average over the entire window, it only really makes sense for a homogeneous point process (Baddeley, Rubak, and Turner 2015, 157–60)

Estimating Intensity

For a homogeneous point process, the intensity can be estimated in a simplistic way: summing the individual intensities of the marks (Baddeley, Rubak, and Turner 2015, 161).

Code
intensityPointProcess <- function(pp,mark) if(mark) intensity(pp) else sum(intensity(pp))

intensityPointProcess(pp_ls[['0.01']], mark = FALSE) %>% round(6)
[1] 0.001909

Otherwise, we can compute the intensity for each mark individually.

Code
intensityPointProcess(pp_ls[['0.01']], mark = TRUE) %>% round(8)
  Ambiguous   Astrocyte Endothelial   Ependymal  Excitatory  Inhibitory 
 0.00024151  0.00020183  0.00014653  0.00008373  0.00036867  0.00061393 
  Microglia OD Immature   OD Mature   Pericytes 
 0.00003031  0.00006249  0.00014278  0.00001750 

Kernel Estimation

In kernel estimation, we try to estimate the intensity function \(\lambda(u)\) of the point process. There are a wide variety of kernel estimators (see (Baddeley, Rubak, and Turner 2015, 168)), but a popular choice is the isotropic Gaussian kernel where the standard deviation corresponds to the smoothing bandwidth (Baddeley, Rubak, and Turner 2015, 168).

Code
pp_sel <-  subset(pp_ls[['0.01']]$`OD Mature`, drop = TRUE)
Dens <- density(pp_sel, sigma = 100)
plot(Dens, main = 'Kernel Density (OD Mature cells)')

Quadrat Counting

In quadrat counting, all points falling into a given quadrat are counted. This gives an overview on the characteristics of the point pattern, such as correlation stationarity (Baddeley, Rubak, and Turner 2015, 163).

Code
Q5 <- quadratcount(pp_ls[['0.01']], nx=8, ny=8)
plot(unmark(pp[['0.01']]), main='Unmarked Point Pattern Quadrats')
plot(Q5, col='black', add=TRUE)

Under independence assumptions, the quadrat counts can be used for testing homogeneity, i.e., if the points are distributed evenly across the quadrats (Baddeley, Rubak, and Turner 2015, 164–65).

Code
val <- quadrat.test(pp_ls[['0.01']]$`OD Mature`, 5, alternative="regular", method="MonteCarlo")
val

    Conditional Monte Carlo test of CSR using quadrat counts
    Test statistic: Pearson X2 statistic

data:  pp_ls[["0.01"]]$`OD Mature`
X2 = 635.09, p-value = 1
alternative hypothesis: regular

Quadrats: 25 tiles (irregular windows)

A p-value of 1 indicates that the null hypothesis of irregularity can not be rejected strongly. Thus, the point pattern of oligodendrocyts is strongly irregular.

Alternatively, we can inspect departures from the hypothesis that points were generated by a Poisson process. We can identify hotspots and coldspots by comparing the standard error of the relrisk function, which computes nonparamatric estimates of the relative risk by kernel smoothing, to the theoretical null distribution of points. The relative risk is the ratio of spatially varying probablilities of different types (Buller 2020).

Code
# select marks
selection <- c('OD Mature', 'Ependymal', 'Microglia')
pp_sel <-  subset(pp[['0.01']], marks %in% selection, drop = TRUE)

f1 <- pValuesHotspotMarks(pp_sel)

# Plot significant p-values
plot(f1$p, main = "Significant difference\n to Poisson process alpha = 0.05")

Testing for CSR

Whether or not a point process is completely spatially random (CSR) depends on two characteristics: points need to be distributed homogeneously and they have to be independent of each other (see definitions above). There are various ways to test for CSR, here we show the use-case of the clark-evans test (Baddeley, Rubak, and Turner 2015, 165–66).

Code
clarkevans.test(pp_ls[['0.01']]$`OD Mature`)

    Clark-Evans test
    No edge correction
    Z-test

data:  pp_ls[["0.01"]]$`OD Mature`
R = 0.77286, p-value < 2.2e-16
alternative hypothesis: two-sided

Correlation

Correlation, or more generally covariance, represents a second-order summary statistic and measures dependence between data points (Baddeley, Rubak, and Turner 2015, 199).

Ripley’s \(K\)

Empirical Ripley’s \(K\)

In the framework of correlation analysis, we often look at distances \(d_{ij} = ||x_i-x_j||\) of all points. It is then natural to look at the summary of these distances, \(d_{ij}\), e.g. as a histogram. The histogram of this point process depends on the observation window \(W\), thus the histogram can change significantly with a different window. Therefore, we look at the empirical distribution function of the distances \(d_{ij}\) that are smaller or equal than a radius \(r\) (Baddeley, Rubak, and Turner 2015, 203)

What we actually measure here is “the average number of r-neighbours of a typical random point” (Baddeley, Rubak, and Turner 2015, 204). This number is still dependent on the size of the observation window so we can standardise it by the number of points and the window size, \(|W|\). We then obtain the empirical Ripley’s \(K\) function (Baddeley, Rubak, and Turner 2015, 204):

\[ \hat{K}(r) = \frac{|W|}{n(n-1)}\sum_{i=1}^n\sum_{j=1 \\j \neq i}^n\{d_{ij}\leq r\} e_{ij}(r) \]

The standardisation makes it possible to compare point patterns with different observation windows and with different numbers of points. However, using the empirical \(K\) function assumes though that the point process has homogeneous intensity, which is often not the case for biological tissue (Baddeley, Rubak, and Turner 2015, 204–5). We will return to this issue below in the Correcting for Inhomogeneity. The term \(e_{ij}(r)\) is for edge correction. We will briefly cover this in Edge effects and their corrections for spatial metrics

Edge effects and their corrections for spatial metrics

Edge effects describe the phenomenon that not the entire point process is observed, but rather only the part within the window \(W\). This means the value of various statistics could be biased along the edges (Baddeley, Rubak, and Turner 2015, 213).

There are many corrections for edge effects that are briefly listed here (Baddeley, Rubak, and Turner 2015, 214–19):

Border correction:

In border correction the summation of data points is restricted to \(x_i\) for which \(b(x_i,r)\) is completely in the window \(W\).

Isotropic correction:

We can regard edge effect as a sampling bias. Larger distances (e.g. close to the edges) are less likely to be observed. This can be corrected for.

Translation correction:

A stationary point process \(X\) is invariant to translations. So the entire point process can be shifted by a vector \(s\) to be at the position \(X+s\).

The \(L\)-function

The \(K\)-function can be ``centered’’, which is then called the \(L\)-function. The \(L\)-function is a variance-stabilising version of the \(K\)-function (Canete et al. 2022):

\[ L(r) = \sqrt{\frac{K(r)}{\pi}}. \]

Pair Correlation function

We have seen above that the \(K\)-function is cumulative. That is, the contributions of all distances smaller equal to \(r\) are considered. An alternative is to take the derivative of the \(K\)-function in order to obtain contributions of distances between points equal to \(r\), according to:

\[ g(r) = \frac{K'(r)}{2\pi r}, \]

where \(g(r)\) is the derivative of the \(K\) function (so interactions at exactly \(r\)) divided by the probability of a poisson process at this radius (Baddeley, Rubak, and Turner 2015, 225).

Code
#PRE: list of point pattern, corresponding celltypes of interest, functions to evaluate
#POST: result of the metric
metricRes <- function(plot_by, pp, celltype, fun, bootstrap, continuous, f){
  if(continuous){
    pp <- subset(pp, select = plot_by)
  }
  else{
    pp_sel <- pp[[plot_by]]
    pp <- subset(pp_sel, marks == celltype)
  }
  if(bootstrap){
    metric.res <- lohboot(pp, fun = fun, f = f)
  }
  else{
    metric.res <- do.call(fun, args = list(X=pp, f=f))
  }
  metric.res$type <- celltype
  metric.res$plot_by <- plot_by
  return(metric.res)
}

#PRE: celltypes, function to calculation and edge correction method
#POST: dataframe of 
metricResToDF <- function(plot_by, celltype, pp, fun, edgecorr, bootstrap, continuous, f){
  lapply(plot_by, function(u) {
    metricRes(u, fun = fun, pp = pp, celltype = celltype, bootstrap, continuous, f)  %>%
      as.data.frame()
  }) %>% bind_rows
}

# [MR: try to write the above a little more compactly]

#PRE: Celltypes of interest, function to analyse, edge correction to perform
#POST: plot of the metric
plotMetric <- function(plot_by, pp, celltype = NULL, x, fun, edgecorr, bootstrap = FALSE, continuous = FALSE, f = NULL){
  #calculate the metric and store as dataframe
  res_df <- metricResToDF(plot_by, celltype, pp, fun, edgecorr, bootstrap, continuous, f)
  #plot the curves
  p <- ggplot(res_df, aes(x=.data[[x]], y=.data[[edgecorr]], col= (plot_by)))+
      geom_line() +
      {if(bootstrap)geom_ribbon(aes(ymin = lo, ymax = hi), alpha = 0.25)}+
      ggtitle(paste0(fun, '-function'))+
      geom_line(aes(x=.data[[x]],y=theo),linetype = "dashed")+
      ylab(edgecorr) +
      # scale_color_manual(name='Point Processes',
      #                  breaks=c('-0.29', '0.01',
      #                           '0.06', 'Poisson'),
      #                  values=c('-0.29'='red',
      #                           '0.01'='dark green',
      #                           '0.06'='blue', 'Poisson'='black'))+
      theme_light()
  
  return(p)
}

p_K <- plotMetric(plot_by = zstack_list, pp, celltype_ls, 'r', 'Kest', 'iso', bootstrap = TRUE)
p_L <- plotMetric(plot_by = zstack_list, pp, celltype_ls, 'r', 'Lest', 'iso', bootstrap = TRUE)
p_g <- plotMetric(plot_by = zstack_list, pp, celltype_ls, 'r', 'pcf', 'border', bootstrap = TRUE)
Code
p_homo <- wrap_plots(list(p_K,p_L,p_g), guides = 'collect')
p_homo

As we have seen above in the test for homogeneity in oligodendrocytes, the assumptions of homogeneity are not given in our data. Therefore, we will have to use the inhomogeneous alternatives (inhomogeneity correction and local scaling) of the metrics instead.

Correcting for Inhomogeneity

Inhomogeneous \(K\)-function

In the case that a spatial pattern is known or suspected to be inhomogeneous, we have to take this into account in the analysis. Biological point patterns display inhomogeneity very often, therefore this analysis is preferred over the homogeneous alternatives. Inhomogeneous alternatives can be estimated via:

\[ \hat{K}_{inhom}(r) = \frac{1}{D^p|W|}\sum_i\sum_{j \neq i} \frac{\mathbb{1}\{||u-x_j||\leq r\}}{\hat{\lambda}(x_j)\hat{\lambda}(x_i)}e(x_j,x_i;r), \]

where \(e(u,v;r)\) is an edge correction weight and \(\hat{\lambda}(x_i)\) is an estimator of the intensity at point \(x_i\). The inhomogeneity correction happens via these \(\hat{\lambda}(x_i)\) per point \(x_i\). The estimation of these local intensities can happen in a data-dependent manner via kernel-smoothing. As this is the same data to then calculate the metric on, this can lead to biases. However, in the case where the local intensities are known, the inhomogeneous \(K\) function is an unbiased estimator (Baddeley, Rubak, and Turner 2015, 243–44).

Code
p_K <- plotMetric(plot_by = zstack_list, pp, celltype_ls,'r','Kinhom', 'iso', bootstrap = TRUE)
p_L <- plotMetric(plot_by = zstack_list, pp, celltype_ls,'r', 'Linhom', 'iso', bootstrap = TRUE)
p_g <- plotMetric(plot_by = zstack_list, pp, celltype_ls,'r', 'pcfinhom', 'border', bootstrap = TRUE)
Code
p_inhomo <- wrap_plots(list(p_K,p_L,p_g), guides = 'collect')
p_inhomo

The inhomogeneous \(K\)-function tells us that the microglia cells follow close to a Poisson process (dashed line) closely and can therefore be assumed to be randomly distributed and not clustered. Ependymal cells show a high degree of clustering at a low radius \(r\). OD mature cells exhibit a medium level of clustering.

In the \(L\)-function, the microglia cells are along the dashed Poisson line, indicating no clustering; ependymal cells are highly clustered at low values of \(r\), whereas OD mature show intermediate clustering.

The pair correlation function is the derivative of the \(K\)-function. The pcf plot gives similar information as before: microglia cells are around the dashed Poisson line. OD Mature cells show a rather broad range of correlations between \(r \in [20,100]\). Ependymal cells have a very strong correlation at \(\sim r = 25\).

Interestingly, the curves for the inhomogeneous functions of ependymal cells and OD mature cells cross the poisson line at \(r=300\). This means that the inhomogeneous functions find repulsion of ependymal cells and OD mature cells past a radius of \(r=300\).

Local Scaling

Locally-scaled \(K\)-function

In the inhomogeneous \(K\)-function approach above, we assume that the intensities can vary locally but the scale of the point process is not changed. This means that while the intensities might vary in the parts of the point pattern, the pattern in one subquadrat is not just a scaled version of another subquadrat. In a biological sample, this assumption is easily violated, e.g. when a gradient of cells increases from one side to another [ME needs a reference]. The correlation structure might scale linearly with the distance (Baddeley, Rubak, and Turner 2015, 246–47) (prokevsova2006statistics?).

To circumvent this local scaling, we can assume that the process is subdivided into small regions. In these small regions, the point process is a scaled version of a template process. This template process needs to be both stationary and isotropic (Baddeley, Rubak, and Turner 2015, 246–47).

Locally-scaled \(L\)-function

Since the \(L\)-function is simply a transformation of the \(K\)-function, the same local scaling framework can be applied to the \(L\)-function (Baddeley, Rubak, and Turner 2015, 246–47).

Code
p_K <- plotMetric(plot_by = zstack_list, pp, celltype_ls, 'r', 'Kscaled', 'iso', bootstrap = FALSE)
p_L <- plotMetric(plot_by = zstack_list, pp, celltype_ls, 'r', 'Lscaled', 'iso', bootstrap = FALSE)
#p_g <- plotScaledMetric(zstack_list,celltype_ls, pp_ls, 'pcfscaled', 'iso')$p
Code
p_scaled <- wrap_plots(list(p_K, p_L), guides = 'collect')
p_scaled

As seen by our permutation tests we have grounds to believe that the patterns of ependymal and OD mature cells are locally scaled. Therefore, we will analyse this pattern using a locally scaled K and L function.

We see, that again our ependymal cells are very far from the poisson process line, indicating strong clustering among ependymal cells. The OD mature cells range in the middle showing intermediate clustering

Code
wrap_plots(list(p_homo, p_inhomo, p_scaled), nrow = 3, guides = 'collect')

In the section on homogeneity vs. inhomogeneity above we have argued using permutation tests that our pattern is not homogeneous but rather inhomogeneous. In fact, both OD mature cells and ependymal cells even show local scaling in the permutation test. In the plot above we see all variants of the correlation metrics. Given the assumptions of either homogeneity (first row), inhomogeneity (second row) and local scaling (third row) changes the interpretation to some extent.

Deciding whether a pattern is homogeneous or not is easy and should always be done. In fact, most biological tissue will be inhomogeneous. The kind of inhomogeneity is more difficult to determine. Since the inhomogeneous curves (second row) and the local scaling curves (third row) are very different, it matters a lot to have good grounds to assume one or the other. Given our permutation tests above, we see that most support is found for local scaling metrics.

All the variants assume something about the properties of the metrics and given valuable insights. In the choice which metric fits the most, some care has to be taken.

Local Indicators of Spatial Association

It is worth noting that the \(K\)- and \(L\)-functions described above are summary statistics over the entire pattern (i.e., averaged over all points). However, if we know that there are different regions in our point pattern, an alternative strategy is to compute ``local’’ contributions to these patterns, i.e., local \(K\)- ,\(L\)- or pair-correlation functions. Baddeley et. al. propose to compare these \(n\) functions with so-called functional principal component analysis (see below). We will show here the example of the LISA version of the \(L\)-function (Baddeley, Rubak, and Turner 2015, 247–48).

Local \(L\) function
Code
L_odmature_lisa <- localL(pp_ls[['0.01']]$`OD Mature`)

df <- as.data.frame(L_odmature_lisa)
dfm <- reshape2::melt(df, "r")

get_sel <- dfm %>% filter(r > 200.5630 & r < 201.4388, variable != "theo") %>%
  mutate(sel = value) %>% select(variable, sel)

dfm <- dfm %>% left_join(get_sel)

p <- ggplot(dfm, aes(x=r, y=value, group=variable, colour=sel)) +
  geom_line() + 
  scale_color_continuous(type = "viridis") +
  geom_vline(xintercept = 200) +
  theme(legend.position = "none") +
  theme_light()

ppdf <- as.data.frame(pp[['0.01']]) %>% filter(marks=="OD Mature")
ppdf$sel <- get_sel$sel # assume they are in same order

q <- ggplot(ppdf, aes(x=x, y=y, colour=sel)) + 
  geom_point() +
  scale_color_continuous(type = "viridis") +
  theme(legend.position = "none") +
  theme_light()
Code
p|q

In the case of the OD mature cells, we obtain further information with this plot. We note that there are two distinct populations of curves: those that are clearly above the mean LISA curve in black and others that are around/underneath. This indicates that there are two different kinds of interactions in the OD mature cells. Stronger clustering and less clustered regions.

There are inhomogeneous versions of these (e.g. localLinhom) that are not shown here for brevity.

Functional PCA for the \(n\) Curves

We apply functional PCA to retrieve the main trends in these individual curves. The idea of functional PCA is the same as for ordinary PCA but applied to functional data (i.e., each observation is a function instead of a point). For the \(n\) functions above, functional PCA will recover the main trends in the data (Ramsay and Silverman 2005). We use the R package fdapace to perform functional PCA (Zhou et al. 2022).

Code
#adapted from the fdapace vignette
functional.pca.pp <- function(df){
  df_fdob <- asinh(df %>% as.matrix / 50)  # [MR: should we transform? if so, how?]
  #df_fdob <- df %>% as.matrix # [MR: should we transform? if so, how?]
  #remove theo column - we want only the actual estimations in there without the Poisson line theo
  if('r' %in% colnames(df) || 'theo' %in% colnames(df))
     df_fdob <- df_fdob[,!colnames(df_fdob) %in% c("r", "theo")]
  if('Ependymal' %in% colnames(df))
    df_fdob <- df_fdob[,!colnames(df_fdob) %in% c("trans",'iso')]

  #number of columns
  N <- ncol(df_fdob)
  #number of rows
  M <- nrow(df_fdob)
  #the x values at which all the curves were evaluated, here called tVec
  s <- df$r
  #create the FPCA object
  fd_obj <- fdapace::MakeFPCAInputs(IDs = rep(1:N, each=M),
                                    tVec=rep(s,N), yVec=df_fdob)
  print(which( unlist( lapply(fd_obj$Lt, 
                              function(x) length(x) != length(unique(x))))))
  #check that the FPCA object is valid
  fdapace::CheckData(fd_obj$Ly, fd_obj$Lt)
  #run the computation of the FPCA - would work with sparse data.
  fpca_obj <- fdapace::FPCA(fd_obj$Ly, fd_obj$Lt, 
                            list(plot = TRUE, dataType='Dense', kernel='rect'))
  fdapace::CreatePathPlot(fpca_obj,K = 3, pch = 4,
                          showObs = FALSE, showMean = TRUE)
  return(fpca_obj)
}

fpca_obj <- functional.pca.pp(L_odmature_lisa)

Code
fpca_pc_df <- as.data.frame(fpca_obj$xiEst) %>% 
  rename(PC1 = V1, PC2 = V2, PC3 = V3)
fpca_pc_df$sel <- get_sel$sel # assume they are in same order

ggplot(fpca_pc_df, aes(x=PC1, y=PC2, col = sel)) + 
  scale_color_continuous(type = "viridis") +
  geom_point() +
  theme_light() +
  ggtitle("Biplot of the LISA L curves of the OD mature cells")

Code
p1 <- ggplot(ppdf, aes(x=x, y=y, colour = fpca_pc_df$PC1)) + 
  scale_color_continuous(type = "viridis") +
  theme_light() +
  geom_point()

p2 <- ggplot(ppdf, aes(x=x, y=y, colour = fpca_pc_df$PC2)) + 
  scale_color_continuous(type = "viridis") +
  theme_light() +
  geom_point()

p1|p2

Here, we see the functional PCA for the OD mature cells. The Design plot tells us that we have a very dense dataset over the entire support [MR: do we need the ‘Design plot’? it doesn’t show much]. The mean curve displays the mean trend over all \(n\) LISA \(L\)-curves (which is itself similar to the locally-scaled \(L\)-function). The scree plot indicates that the first eigenfunction explains more than \(80 \%\) of the variance. The eigenfunction curves in the bottom right panel indicate the deviation from the mean curve.

Looking at the second plot, we see the smoothed mean curve and the individual curves that are reconstructed from the first three eigenfunctions. The first eigenfunction from the bottom right panel, \(\phi_1\), is above the mean curve, which relates to the population of curves above the mean. \(\phi_2\) is first above the mean curve and then lower than the mean curve. These curves are visible as well. Lastly, \(\phi_3\) is curves that start low and pick up to be larger than the mean curve in the end. This is visible in e.g. the orange dashed line (Ramsay and Silverman 2005).

The last plot shows the biplot of the loadings of the functional PCA. Each point is a cell from the OD mature cells with the loadings of the first two principal components plotted. The points are coloured as they were in the plots of the LISA \(L\)-curves. The first principal component clearly separates the two populations.

Third-Order Summary Statistics

So far we have considered first- and second-order summary statistics and local (or inhomogeneous) adaptations of them. In the second order, one considers (counts of) pairs (e.g., \(K\) function). In a third-order setting, we would count triplets of points. A triplet is counted as the normalised expected value of triangles where all edges are smaller than the radius r (Baddeley, Rubak, and Turner 2015, 249).

Spacing

So far, most approaches considered intensity and correlation as measures to assess a point pattern. Next, we will look at measures of spacing and shortest-distances to assess spatial arrangements (Baddeley, Rubak, and Turner 2015, 255).

Baddeley et.al. summarises three basic distances to measure:

  • pairwise distance: \(d_{i,j} = ||x_i-x_j||\)
  • NN distances: \(d_i = \min_{j \neq i}d_{ij}\)
  • empty-space distance: \(d(u) = \min_j||u-x_j||\)

Note also that there are tests of CSR that are based on spacing, including the Clark-Evans and Hopkins-Skellam Index tests that were discussed above ``Testing for CSR’’.

Nearest Neighbour approaches

Nearest neighbour (NN) methods are based on the notion of “nearness”. In particular, we introduce nndist from spatstat, a method to calculate the distances until \(k\) NN are found. This function returns a density for each specified \(k\) for the \(k\) neighbour distances. We can for instance collapse the \(k\) curves into a mean curve per point pattern. This information of the mean nearest neighbour distance (MMND) can be summarised as a density. Note, that these distances are “raw” nearest-neighbour distances which are not corrected for edge effects. Edge correction for the nearest neighbour distance (\(k = 1\)) is implemented in the function Gest below (Baddeley, Rubak, and Turner 2015, 256) (Baddeley and Turner 2005).

Code
nndistance <- function(pp, nk){
  xy <- cbind(pp$x, pp$y)
  nndistances_k15 <- nndist(xy, k = nk) 
  nndistances_mean <- rowMeans(nndistances_k15)
  return(nndistances_mean)
}

#PRE: list of point pattern, corresponding celltypes of interest, functions to evaluate
#POST: result of the metric
metricRes_nndist <- function(ppls, celltype, fun){
  metric.res <- list(res = do.call(fun, args = list(pp=ppls[[celltype]], nk = seq(1:15))))
  metric.res$type = celltype
  return(metric.res)
}
# [MR: again, this function looks again like those before and maybe could be done as an all-in-one wrapper.]
celltypes <- c("Ependymal", "OD Mature", "Microglia")
#go through all defined celltypes and calculate the nearest-neighbour distance
res_ls <- lapply(celltypes, metricRes_nndist, fun = nndistance, ppls = pp_ls[['0.01']])
#initialise a dataframe for the metric values and the type information
res_df <- data.frame(metric = numeric(0), type = character(0))
# Loop through the res_ls list and combine the metric values with their corresponding type - ChatGPT
for (i in 1:length(res_ls)) {
  metric_values <- res_ls[[i]]$res
  metric_type <- rep(res_ls[[i]]$type, length(metric_values))
  df <- data.frame(metric = metric_values, type = metric_type)
  res_df <- rbind(res_df, df)
}
#plot the densities
p <- ggplot(res_df, aes(x=metric, col= type))+
    geom_density(linewidth=1)+
    scale_x_sqrt() +
    theme_light() +
    ggtitle('Sqrt of the Mean Nearest-Neighbour Distance')
p

In the MNND empirical distribution, the ependymal cells show the shortest NN distances, a reflection of their clustering. The OD mature cells have larger NN distances as well as a bimodal distribution, indicating a mix of longer and wider distances (as visible in the LISA \(L\)-functions). Microglia cells show the widest distances and the symmetry of the curve indicates similar distances throughout the field of view.

Nearest-neighbour function \(G\) and empty-space function \(F\)

Definitions of \(F\) and \(G\) function

Under a stationary spatial point process, the empty-space distance is defined as:

\[ d(u,X) = \min\{||u-x_i||: x_i \in X\} \]

Note that this is an edge-corrected distribution function of the nearest-neighbour distance above.

[MR: what does one do for a non-stationary process?] [ME: that is a very good question, basically all point pattern methods assume stationarity]

The empty space function is then the cumulative distribution function of the empty-space distances defined above:

\[ F(r) = \mathbb{P}\{d(u,X)\leq r\}. \]

The NN distance is defined as:

\[ d_i = \min_{j\neq i}||x_j-x_i||. \]

The NN distance distribution function \(G(r)\) is then defined as:

\[ G(r) = \mathbb{P}\{d(x,X\backslash u \leq r |X\ has\ a\ point\ at\ u\}. \]

For a homogeneous Poisson process, the NN distance distribution is identical to the empty-space function of the same process:

\[ G_{pois} \equiv F_{pois}. \]

For a general point process, the \(F\) and \(G\) functions are different (Baddeley, Rubak, and Turner 2015, 261–64).

Empty-space hazard

The \(F\) and \(G\) functions are, like the \(K\) function, cumulative. The same disadvantages as with the \(K\) function occur here too [MR: what disadvantages? maybe say them explicitly here]. Therefore, an analogue to the pair-correlation function would make sense to consider. For practical reasons, this is no longer the derivative of the \(F\) function but rather a hazard rate:

\[ h(r) = \frac{f(r)}{1-F(r)}. \]

For a CSR process, the hazard rate is:

\[ h_{pois}(r) = 2 \pi \lambda r \]

(i.e., linear in \(r\)) (Baddeley, Rubak, and Turner 2015, 271–74).

\(J\)-Function

The concepts of the empty-space function \(F\) and the NN function \(G\) are somewhat complementary. If one decreases, the other increases.

Thus, a related approach is the \(J\) function:

\[ J(r) = \frac{1-G(r)}{1-F(r)}. \]

For a CSR process, \(J_{pois} \equiv 1\), whereas values of \(J(r) > 1\) are consistent with a regular (e.g., repelling) pattern, and $J(r) < 1 represents a clustered process (Baddeley, Rubak, and Turner 2015, 274–76).

Code
p_G <- plotMetric(plot_by = zstack_list, pp, celltype_ls, 'r', 'Gest', 'rs', bootstrap = FALSE)
p_F <- plotMetric(plot_by = zstack_list, pp, celltype_ls, 'r', 'Fest', 'rs', bootstrap = FALSE)
p_J <- plotMetric(plot_by = zstack_list, pp, celltype_ls, 'r', 'Jest', 'rs', bootstrap = FALSE)
Code
wrap_plots(list(p_G,p_F,p_J), guides = 'collect')

Accounting for Inhomogeneity in Spacing Functions

[MR: so if there are inhomogeneous versions, should we skip the plots above? Or not, because the inhomogeneous versions are inaccurate because they still assume correlation stationarity. What do we do?!? We need to be careful not to paint overselves into a corner.]

There are inhomogeneous variants of the spacing functions explained above

Code
p_G <- plotMetric(plot_by = zstack_list, pp, celltype_ls, 'r', 'Ginhom', 'bord')
p_F <- plotMetric(plot_by = zstack_list, pp, celltype_ls, 'r', 'Finhom', 'bord')
p_J <- plotMetric(plot_by = zstack_list, pp, celltype_ls, 'r', 'Jinhom', 'bord')
Code
wrap_plots(list(p_G,p_F,p_J), guides = 'collect')

[ME: add some text to inhomogeneity]

Nearest-Neighbour Orientation

Next to the NN distance, we can estimate the orientation of the neighbours, which gives an indication of the orientation of the spacing. It works by taking the angle between each point and its \(k^{th}\) nearest neighbour. The angle is anticlockwise from the x-axis (Baddeley, Rubak, and Turner 2015, 278–79) (Baddeley and Turner 2005).

Code
p <- plotMetric(plot_by = zstack_list, pp, celltype_ls, 'phi', 'nnorient', 'bordm', bootstrap = FALSE)
p

The values of \(\phi\) correspond to the orientation of the point pattern. The horizontal axis goes from \(180\) to \(0\) (left to right) and the vertical from \(90\) to \(270\) (top to bottom) [MR: I find this description quite confusing, because you are not talking about the horizontal/vertical axis of the metric plot, but rather the orientations in the point patterns. And since certain \(\phi\)s represent these orientations, is it worth adding some shading of the x-axis to highlight this? Thicker lines would also benefit the old people reading this. And I wonder if faceting on celltype would help also. What does the Y-axis represent? And are there local variants of these? Maybe the clustered OD cells have a different orientation than the non-clustered?].

We can infer that the orientation of the Ependymal NNs is primarily along the vertical axis, OD mature cells do not show a clear orientation and microglial cells a horizontal orientation in their NNs with a modest peak at \(\sim 180\) (orientation to the top).

Note also tha the concepts of spacing are not only usable in point pattern analysis but also more broadly in other spatial contexts (e.g., spacing between shapes instead of points).

Edge Corrections

The same consideration about edge effects as for the \(K\) (and related) functions need to be made for the spacing functions; uncorrected estimates are negatively biased estimators. The easiest approach is to draw an artificial border and consider NNs within it. Other approaches are based on sampling. Yet another approach relates to survival analysis, with the idea is that a circle of a point to grows homogeneously with increasing radius until it hits the frame of the window and “dies”. This gives survival distributions similar to censored data, where the Kaplan-Meier estimator is the optimal choice (Baddeley, Rubak, and Turner 2015, 286–92).

Continuous Marks

Setup

Code
# redefine the pp here to be zstack 0.01
pp <- pp[['0.01']]
#subset the data to only look at sample ID 0.01
sub_2CT = spe[, spe$sample_id == "0.01" & spe$cluster_id %in% c("Astrocyte", "Inhibitory")]
(pp_2CT = .ppp(sub_2CT, marks = "cluster_id"))
Marked planar point pattern: 2611 points
marks are of storage type  'character'
window: rectangle = [1222.5635, 3012.4248] x [-3990.104, -2204.671] units
Code
sub <- spe[, spe$sample_id == '0.01']
#[MR: this above never gets used?]

In spatstat, a mark can basically take any value, discrete (as we have seen above) or continuous (e.g., gene expression). In our example, we take the gene expression of some marker genes from Fig. 6 of the original publication (Baddeley, Rubak, and Turner 2015, 637) (Moffitt et al. 2018). This is a typical numerical mark for points in a biological dataset.

Code
#  Genes from Fig. 6 of Moffitt et al. (2018)
genes <- c('Slc18a2', 'Esr1', 'Pgr')
gex <- assay(sub)[genes,] %>% t %>% as.matrix %>% 
  data.frame %>% set_rownames(NULL)
# gene expression to marks [ME: is it really expression?]
marks(pp) <- gex

TODO: better plotting?

Code
plot(pp)

Here we see spatial distribution of the counts of the three genes Slc18a2, Esr1 and Pgr. The size of the circles indicates the counts of the transcripts at that spot. [ME: is this true?] Since there are really a lot of points, we can’t easily distinguish general patterns of count distributions.

We can investigate the distribution of the marks against the spatial location of the points and against each other. This is done with the function pairs from spatstat. It generates a scatterplot of the counts of the marks (in our case the three genes) against each other and against the \(x\) and \(y\) coordinates. We can add a non-linear smoothing curve to make the general trends a bit more obvious (Baddeley, Rubak, and Turner 2015, 641).

Code
pairs(as.data.frame(pp), panel = panel.smooth, pch=".")

We find that the counts of the three genes are very evenly distributed along the \(x\) and \(y\) coordinate, indicating a homogeneous distribution. The counts of Esr1 and Pgr are positively associated, indicating a dependence of these two marks.

NN interpolations uses the nearest mark to measure the intensity at each spatial location. This is conceptually similar to taking a very small bandwidth for Gaussian kernel smoothing (Baddeley, Rubak, and Turner 2015, 642).

Code
plot(nnmark(pp))

We see that there is e.g. a clear spatial structure in the expression of e.g. Esr1. It shows a half moon shape.

Summary functions for continuous marks

As in the discrete case, summary functions assume that the point process is stationary.

Mark correlation function

The mark correlation function measures the dependence between two marks for two points at distance \(r\). It is applicable to stationary point processes with marks. It is not a correlation in the classical sense, since it cannot take negative values [MR: was this a typo? is this correct?]. Instead, a value of 1 indicates no correlation between the marks [MR: does this mean that negative correlation is not allowed? Or, would it appear as ‘less than 1’?]. The generalized mark correlation function is given by:

\[ k_f(r) = \frac{\mathbb{E}[f(m(u),m(v))]}{\mathbb{E}[f(M,M')]},\] where \(f(m_1,m_2)\) is a test function with two arguments (representing the two marks) and returns a non-negative value [MR: what are \(u\) and \(v\) in the formula above?]. For continuous non-negative marks, the canonical choice for \(f\) is typically \(f(m_1,m_2)= m_1 m_2\). \(M\) and \(M′\) represent independent, identically distributed random points with the same distribution as the mark of a randomly chosen point. This denominator is chosen such that random marks have a mark correlation of 1 (Baddeley, Rubak, and Turner 2015, 644–45).

Code
plotMetric(plot_by = genes,  pp = pp, x = 'r', fun = 'markcorr', edgecorr = 'iso', continuous = TRUE)

From this plot we show that all genes show a positive correlation at small distances which decline with increasing radius \(r\). The association is strongest for the Slc18a2 gene. We can calculate simulation envelopes to estimate the significance of this association. This is not shown for brevity.

Mark-weighted \(K\)-function

The mark-weigthed \(K\)-function is a generalization of the \(K\)-function in which the contribution from each pair of points is weighted by a function of their respective marks. It is given by:

\[K_f(r) = \frac 1 \lambda \frac{C_f(r)}{E[ f(M_1, M_2) ]},\] where:

\[ C_f(r) = E \left[ \sum_{x \in X} f(m(u), m(x)) 1\{0 < ||u - x|| \le r\} \; \big| \; u \in X \right], \]

is equivalent to the unnormalized mark-weighted \(K\)-function. For every point \(u\), we sum the euclidean distance \(||u - x||\) of all other points \(x\) that are within a distance \(r\). This sum is weighted by the function \(f(.,.)\) of the marks of \(u\) and \(x\). The function is standardized by the expected value of \(f(M_1, M_2)\) where \(M_1, M_2\) represent independent, identically distributed random points with the same distribution as the mark of a randomly chosen point (Baddeley, Rubak, and Turner 2015, 646–47).

In the scenario of random labeling, so where the labels or in this case marks are distributed randomly, the mark-weighted \(K\)-function corresponds to the standard Ripley’s \(K\)-function.

Also here, the canonical function is: \(f(m_1, m_2) = m_1 m_2\). This means we weigh each interaction between points by the product of the continuous marks of both points.

Code
plotMetric(plot_by = genes,  pp = pp, x = 'r', fun = 'Kmark', edgecorr = 'iso', continuous = TRUE, f = function(m1,m2){m1*m2})

It is important to note that the theoretical value of the \(K\)-function is not very informative since it represents the \(K\)-function of a Poisson point process and the underlying point process might not be Poisson. Therefore we compare the mark-weighted with its unmarked analogue. Like this, we can assess whether the points weighted by a continuous mark are more or less correlated than their unmarked analogues (Baddeley, Rubak, and Turner 2015, 647).

Here we will compare the \(L\)-functions weighted by the mark of the gene Esr1 and the unmarked \(L\)-function.

Code
ppEsr1 <- subset(pp, select = 'Esr1')
L.Esr1L <- Kmark(ppEsr1, function(m1,m2) {m1*m2}, returnL = TRUE)
Lest.ppEsr1 <- Lest(ppEsr1, nlarge=7000)
plot(eval.fv(L.Esr1L - Lest.ppEsr1))

We note that the difference between \(L\)-function weighted by the expression of Esr1 minus the unmarked \(L\)-function is positively different to the poisson difference, meaning that the expression of the continuous mark Esr1 is correlated among itself.

Summary

In this chapter we have looked at point pattern methods that can be applied to univariate marks. These univariate marks can either be discrete (in a molecular biological context that could be celltypes) or continuous (e.g. expression of a gene). There are approaches that summarise correlative dependencies among points or that consider spacing functions.

Appendix

Session info

Code
sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Zurich
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] magrittr_2.0.3                 stringr_1.5.0                 
 [3] dixon_0.0-8                    splancs_2.01-44               
 [5] spdep_1.2-8                    spData_2.3.0                  
 [7] tmap_3.3-4                     scater_1.28.0                 
 [9] scran_1.28.2                   scuttle_1.10.3                
[11] SFEData_1.2.0                  SpatialFeatureExperiment_1.2.3
[13] Voyager_1.2.7                  rgeoda_0.0.10-4               
[15] digest_0.6.33                  ncf_1.3-2                     
[17] sf_1.0-14                      reshape2_1.4.4                
[19] patchwork_1.1.3                STexampleData_1.8.0           
[21] ExperimentHub_2.8.1            AnnotationHub_3.8.0           
[23] BiocFileCache_2.8.0            dbplyr_2.3.4                  
[25] RANN_2.6.1                     seg_0.5-7                     
[27] sp_2.1-1                       rlang_1.1.1                   
[29] ggplot2_3.4.4                  dplyr_1.1.3                   
[31] mixR_0.2.0                     spatstat_3.0-6                
[33] spatstat.linnet_3.1-1          spatstat.model_3.2-6          
[35] rpart_4.1.19                   spatstat.explore_3.2-3        
[37] nlme_3.1-162                   spatstat.random_3.1-6         
[39] spatstat.geom_3.2-5            spatstat.data_3.0-1           
[41] SpatialExperiment_1.10.0       SingleCellExperiment_1.22.0   
[43] SummarizedExperiment_1.30.2    Biobase_2.60.0                
[45] GenomicRanges_1.52.1           GenomeInfoDb_1.36.4           
[47] IRanges_2.34.1                 S4Vectors_0.38.2              
[49] BiocGenerics_0.46.0            MatrixGenerics_1.12.3         
[51] matrixStats_1.0.0             

loaded via a namespace (and not attached):
  [1] spatstat.sparse_3.0-2         bitops_1.0-7                 
  [3] httr_1.4.7                    RColorBrewer_1.1-3           
  [5] numDeriv_2016.8-1.1           backports_1.4.1              
  [7] tools_4.3.1                   utf8_1.2.3                   
  [9] R6_2.5.1                      HDF5Array_1.28.1             
 [11] mgcv_1.8-42                   rhdf5filters_1.12.1          
 [13] withr_2.5.1                   gridExtra_2.3                
 [15] leaflet_2.2.0                 leafem_0.2.3                 
 [17] cli_3.6.1                     labeling_0.4.3               
 [19] proxy_0.4-27                  foreign_0.8-84               
 [21] R.utils_2.12.2                dichromat_2.0-0.1            
 [23] scico_1.5.0                   limma_3.56.2                 
 [25] rstudioapi_0.15.0             RSQLite_2.3.1                
 [27] generics_0.1.3                crosstalk_1.2.0              
 [29] Matrix_1.5-4.1                ggbeeswarm_0.7.2             
 [31] fansi_1.0.5                   abind_1.4-5                  
 [33] R.methodsS3_1.8.2             terra_1.7-55                 
 [35] lifecycle_1.0.3               yaml_2.3.7                   
 [37] edgeR_3.42.4                  rhdf5_2.44.0                 
 [39] tmaptools_3.1-1               grid_4.3.1                   
 [41] blob_1.2.4                    promises_1.2.1               
 [43] dqrng_0.3.1                   crayon_1.5.2                 
 [45] lattice_0.21-8                beachmat_2.16.0              
 [47] KEGGREST_1.40.1               magick_2.8.0                 
 [49] pillar_1.9.0                  knitr_1.44                   
 [51] metapod_1.7.0                 rjson_0.2.21                 
 [53] boot_1.3-28.1                 codetools_0.2-19             
 [55] wk_0.8.0                      glue_1.6.2                   
 [57] data.table_1.14.8             vctrs_0.6.4                  
 [59] png_0.1-8                     gtable_0.3.4                 
 [61] cachem_1.0.8                  xfun_0.40                    
 [63] S4Arrays_1.0.6                mime_0.12                    
 [65] DropletUtils_1.20.0           pracma_2.4.2                 
 [67] units_0.8-4                   statmod_1.5.0                
 [69] bluster_1.10.0                interactiveDisplayBase_1.38.0
 [71] ellipsis_0.3.2                bit64_4.0.5                  
 [73] filelock_1.0.2                irlba_2.3.5.1                
 [75] vipor_0.4.5                   KernSmooth_2.23-21           
 [77] Hmisc_5.1-1                   colorspace_2.1-0             
 [79] DBI_1.1.3                     nnet_7.3-19                  
 [81] raster_3.6-26                 tidyselect_1.2.0             
 [83] bit_4.0.5                     compiler_4.3.1               
 [85] curl_5.1.0                    htmlTable_2.4.1              
 [87] BiocNeighbors_1.18.0          DelayedArray_0.26.7          
 [89] checkmate_2.2.0               scales_1.2.1                 
 [91] classInt_0.4-10               rappdirs_0.3.3               
 [93] goftest_1.2-3                 fftwtools_0.9-11             
 [95] spatstat.utils_3.0-3          rmarkdown_2.25               
 [97] XVector_0.40.0                htmltools_0.5.6.1            
 [99] pkgconfig_2.0.3               base64enc_0.1-3              
[101] sparseMatrixStats_1.12.2      fastmap_1.1.1                
[103] htmlwidgets_1.6.2             shiny_1.7.5.1                
[105] DelayedMatrixStats_1.22.6     farver_2.1.1                 
[107] jsonlite_1.8.7                BiocParallel_1.34.2          
[109] R.oo_1.25.0                   BiocSingular_1.16.0          
[111] RCurl_1.98-1.12               Formula_1.2-5                
[113] GenomeInfoDbData_1.2.10       s2_1.1.4                     
[115] Rhdf5lib_1.22.1               munsell_0.5.0                
[117] Rcpp_1.0.11                   ggnewscale_0.4.9             
[119] viridis_0.6.4                 stringi_1.7.12               
[121] leafsync_0.1.0                MASS_7.3-60                  
[123] zlibbioc_1.46.0               plyr_1.8.9                   
[125] parallel_4.3.1                ggrepel_0.9.4                
[127] deldir_1.0-9                  Biostrings_2.68.1            
[129] stars_0.6-4                   splines_4.3.1                
[131] tensor_1.5                    locfit_1.5-9.8               
[133] igraph_1.5.1                  ScaledMatrix_1.8.1           
[135] BiocVersion_3.17.1            XML_3.99-0.14                
[137] evaluate_0.22                 BiocManager_1.30.22          
[139] httpuv_1.6.11                 purrr_1.0.2                  
[141] polyclip_1.10-6               rsvd_1.0.5                   
[143] lwgeom_0.2-13                 xtable_1.8-4                 
[145] fdapace_0.5.9                 e1071_1.7-13                 
[147] RSpectra_0.16-1               later_1.3.1                  
[149] viridisLite_0.4.2             class_7.3-22                 
[151] tibble_3.2.1                  memoise_2.0.1                
[153] beeswarm_0.4.0                AnnotationDbi_1.62.2         
[155] cluster_2.1.4                

References

Baddeley, Adrian, Ege Rubak, and Rolf Turner. 2015. Spatial Point Patterns. 1st ed. CRC Interdisciplinary Statistics Series. CRC Press, Taylor & Francis Group.
Baddeley, Adrian, and Rolf Turner. 2005. “Spatstat: An R Package for Analyzing Spatial Point Patterns.” Journal of Statistical Software 12 (January): 1–42. https://doi.org/10.18637/jss.v012.i06.
Buller. 2020. “Areas of a Spatial Segregation Model Significantly Different from Null Expectations.” Ian Buller, Ph.D., M.A. https://idblr.rbind.io/post/pvalues-spatial-segregation/.
Canete, Nicolas P, Sourish S Iyengar, John T Ormerod, Heeva Baharlou, Andrew N Harman, and Ellis Patrick. 2022. spicyR: Spatial Analysis of in Situ Cytometry Data in R.” Bioinformatics 38 (11): 3099–3105. https://doi.org/10.1093/bioinformatics/btac268.
Moffitt, Jeffrey R., Dhananjay Bambah-Mukku, Stephen W. Eichhorn, Eric Vaughn, Karthik Shekhar, Julio D. Perez, Nimrod D. Rubinstein, et al. 2018. “Molecular, Spatial, and Functional Single-Cell Profiling of the Hypothalamic Preoptic Region.” Science 362 (6416): eaau5324. https://doi.org/10.1126/science.aau5324.
Ramsay, JO, and BW Silverman. 2005. “Principal Components Analysis for Functional Data.” Functional Data Analysis, 147–72.
Zhou, Yidong, Satarupa Bhattacharjee, Cody Carroll, Yaqing Chen, Xiongtao Dai, Jianing Fan, Alvaro Gajardo, et al. 2022. Fdapace: Functional Data Analysis and Empirical Dynamics. https://github.com/functionaldata/tPACE.